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of models also allows for different parameterizations for the conditional distribution 
of the response variables given the latent traits, depending on both the type of link 
, function and the constraints imposed on the discriminating and the difficulty item 

^ parameters. We illustrate how the proposed class of models may be estimated by the 

maximum likelihood approach via an Expectation-Maximization algorithm, which 
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for the assessment of anxiety and depression. In the first application, we illustrate 
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application, we describe the steps to select a specific model having the best fit in 
our class of IRT models. 
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1 Introduction 



Item Response Theory (IRT) models are commonly used to measure latent traits, through 
the analysis of data deriving from the administration of questionnaires made of items 
with dichotomous or polythomous responses (also known, in the educational setting, as 
dichotomously or polythomously-scored items). Usually, traditional IRT models are based 
on the unidimensionality assumption, which means that all items contribute to measure 
the same latent trait. Moreover, in some cases, the normality assumption of this latent 
trait is explicitly introduced. Unfortunately, in several practical situations both of these 
assumptions are restrictive. Therefore, several extensions of traditional IRT models have 
been proposed in the literature in order to make the models more flexible and realistic. 

Firstly, some authors dealt with multidimensional extensions of IRT models to take 
into account that questionnaires are often designed to measure more than one latent 
trait. Among the main contributions, we remind Duncan and Stenbeck (1987), Agresti 
(1993), and Kelderman and Rijkes (1994), who proposed a number of examples of loglinear 
multidimensional IRT models, Kelderman (1996) for a multidimensional version of the 
Partial Credit Model (Masters, 1982), and Adams et al. (1997) for a wide class of Rasch 
type (Rasch, 1960; Wright and Masters, 1982) extended models; see Reckase (2009) for a 
thorough overview of this topic. 

Another advance in the IRT literature concerns the assumption that the population 
under study is composed by homogeneous classes of individuals who have very similar 
latent characteristics (Lazarsfeld and Henry, 1968; Goodman, 1974). In some contexts, 
where the aim is clustering individuals, this is a convenient assumption; in health care, 
for instance, by introducing this assumption we single out a certain number of clusters 
of patients receiving the same clinical treatment. Secondly, this assumption allows us to 
estimate the model in a semi-parametric way, namely without formulating any assump- 
tion on the latent trait distribution. Moreover, it is possible to implement the maximum 
marginal likelihood method making use of the Expectation-Maximization (EM) algorithm 
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(Dempster et al., 1977), skipping in this way tlic problem of intractability of multidimen- 
sional integral which characterizes the marginal likelihood when a continuous distribution 
is assumed for the latent variable. At this regards, Christensen et al. (2002) outhned, 
through a simulation study, the computational problems encountered during the estima- 
tion process of a multidimensional model based on a multivariate normally distributed 
ability. See also Masters (1985), Langheine and Rost (1988), Heinen (1996), and Formann 
(2007) for a comparison between traditional IRT models with those formulated by a latent 
class approach. For some examples of discretized variants of IRT models we also remind 
Lindsay et al. (1991), Formann (1992), Hoijtink and Molenaar (1997), Vermunt (2001), 
and Smit et al. (2003). Another interesting example of combination between the IRT 
approach and latent class approach is represented by the mixed Rasch model for binary 
and ordinal polythomous data (Rost, 1990, 1991; von Davier and Rost, 1995), builded as 
a mixture of latent classes with a separate Rasch model assumed to hold within each of 
these classes. 

As concerns the combination of the two above mentioned extensions, in the context of 
dichotomously-scored items Bartolucci (2007) proposed a class of multidimensional latent 
class (LC) IRT models, where: (i) more latent traits are simultaneously considered and 
each item is associated with only one of them (between-item multidimensionality; for de- 
tails see Adams et al. (1997) and Zhang (2004)) and (ii) these latent traits are represented 
by a random vector with a discrete distribution common to all subjects (each support point 
of such a distribution identifies a different latent class of individuals). Moreover, in this 
class of models either a Rasch (Rasch, 1960) or a two-parameter logistic (2PL) param- 
eterization (Birnbaum, 1968) may be adopted for the probability of a correct response 
to each item. Similarly to Bartolucci (2007), von Davier (2008) proposed the diagnostic 
model, which, as main difference, assumes fixed rather than free abilities. An interest- 
ing comparison of multidimensional IRT models based on continuous and discrete latent 
traits was performed by Haberman et al. (2008) in terms of goodness of fit, similarity of 
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parameter estimates and computational time required. 

The aim of the present paper is to illustrate a wide class of multidimensional LC 
IRT models that accommodates both binary and ordinal polythomous items in the same 
framework and the R package MultiLCIRT to deal with these models (Bartolucci et al., 
2012). The class of model at issue includes that proposed by Bartolucci (2007) as a 
special case and is formulated so that different parameterizations may be adopted for the 
conditional distribution of the response variables, given the latent traits. We mainly refer 
to the classification criterion proposed by Molenaar (1983); see also Agresti (1990) and 
Van der Ark (2001). Relying on the type of link function, it allows to discern among: 
{i) graded response models, based on global (or cumulative) logits and {ii) partial credit 
models, which make use of local (or adjacent category) logits. For each of these link 
functions, we explicitly consider the possible presence of constraints on the discriminating 
and the difficulty item parameters. As concerns the first element, we take into account the 
possibility that all items have the same discriminating power against the possibility that 
they discriminate differently. Moreover, we discern the case in which each item differs from 
the others for different distances between the difficulties of consecutive response categories 
and the special case in which the distance between difficulty levels from category to 
category is the same for all items. On the basis of the choice of all the mentioned features 
(i.e., type of link function, item discriminating parameters, and item difficulties), different 
item parameterizations are defined. We show how these parameterizations result in an 
extension of traditional IRT models, by introducing assumptions of multidimensionality 
and discreteness of latent traits. 

In order to estimate each model in the proposed class, we rely on an EM algorithm. 
Moreover, special attention is given to the model selection procedure, that aims at choos- 
ing the optimal number of latent classes, the type of link function, the parameterization 
for the item discriminating and difficulty parameters, and the number of latent dimensions 
and the allocation of items within each dimension. Algorithms for the estimation of the 
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proposed class of models have been implemented in the package MultiLCIRT (Bartolucci 
et al., 2012). We describe the main functions of this package, with special attention to 
input required and output supplied. 

Several other R packages are suitable for estimating IRT models and their extensions, 
but to our knowledge neither of them treat multidimensionality and discreteness of latent 
traits at the same time. Among others, we remind packages mirt (Chalmers, 2012), 
plink (Weeks, 2011), irtProb (Raiche, 2011) (only for binary items), and plRasch (Li 
and Hong, 2007) (only for Rasch-type models), for the estimation of multidimensional 
IRT models. Moreover, we cite package mixRasch (Willse, 2009) for the discreteness 
assumption of latent trait (only for Rasch-type models). 

In order to illustrate the proposed class of models and the use of MultiLCIRT pack- 
age, we describe two applications. The first one is about a dataset extrapolated by a 
larger dataset collected in 1996 by the Educational Testing Service within the National 
Assessment of Educational Progress (NAEP) project. Following Bartolucci (2007), we 
illustrate how performing a hierarchical model-based clustering of a set of binary items 
in homogeneous groups, each of them measuring different latent traits. The second ap- 
plication concerns a dataset collected by a set of ordinal polythomous items on anxiety 
and depression of oncological patients, and formulated following the "Hospital Anxiety 
and Depression Scale" (HADS) developed by Zigmond and Snaith (1983). Through this 
application, the steps of the model selection procedure are illustrated and the character- 
istics of each latent class, in terms of estimated levels of the latent traits, are described 
with reference to the selected model. 

The reminder of this paper is organized as follows. In Section 2 we describe the 
proposed class of multidimensional LC IRT models and we focus on the different types 
of parameterizations that can be specified. Section 3 is devoted to maximum likelihood 
estimation and to model selection. Section 4 is devoted to illustrate functions implemented 
in MultiLCIRT package. In Section 5, the proposed class of models is illustrated through 
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the analysis of two real datasets with specific reference to MultiLCIRT package, whereas 
some final remarks are reported in Section 6. 

2 The class of multidimensional latent class IRT mod- 
els 

In the following, we describe the proposed class of models by illustrating the different 
specifications due to the type of link function and the constraints on the item parameters. 
Then, in order to efficiently implement the parameter estimation, the formulation of the 
class of models is described in matrix notation. 

2.1 The model 

Let Xj denote the response variable for the j-th item of the questionnaire, with j — 
1, . . . ,r. We assume that each item j has Ij categories, indexed by x from to Ij — 1, 
where Ij — 2 corresponds to the case of a binary item {x — 0, 1). Let s be the number 
of different latent traits measured by the items, let = (©i,...,©s)' be a vector of 
latent variables corresponding to these latent traits, and let 6 = {6i, ... ,9s)' denote 
one of its possible realizations. The random vector is assumed to have a discrete 
distribution with k support points, denoted by ^i, . . . and probabilities tti, . . . ,7rfc, 
with TTc = p(0 = ^c)- Moreover, let Sjd be a dummy variable equal to 1 if item j measures 
latent trait of type d and to otherwise, with j — 1, . . . , r and d — 1, . . . , s. We also 
denote the conditional response probability that a subject with latent traits (or abilities) 
levels given by 6 responds by category x to item j as follows 

\j^{d) = p{Xj ^x\& = d), X = 0, . . . , - 1, 

and we let Xj{6) = {XjQ{d), . . . , Aj,/^._i(0))', the elements of which sum up to 1. 



6 



The IRT models that are here of interest may be expressed through the following 
general formulation 

s 

9x{\{0)) = 7i(X] SjaOd - Pjx), J = l,...,r, x=l,...,lj- 1, (1) 

d=l 

where gx{-) is a link function specific of category x and 7^ and f3jx are item parameters, 
usually identified as discriminating and difficulty indices and on which suitable constraints 
need to be assumed. 

The formulation of each model belonging to the proposed class depends on the speci- 
fication on the following three elements: 

1. Type of link function: we consider the link based on: (i) global (or cumulative) logits 
and (a) local (or adjacent categories) logits. In the first case, the link function is 

defined as 

and compares the probability that item response is in category x or higher with the 
probability that it is in a lower category. Moreover, with local logits we have that 

rx rn\-\ 1 , piXj = x\0) 

and then the probability of each category x is compared with the probability of the 
previous category; in the context of IRT models, see also Forcina and Bartolucci 
(2004) . We note that, in case of binary items, the two types of logits coincide. Global 
logits are typically used when the trait of interest is assumed to be continuous but 
latent, so that it can be observed only when each subject reaches a given threshold 
on the latent continuum. On the contrary, local logits are used to identify one or 
more intermediate levels of performance on an item and to award a partial credit 
for reaching such intermediate levels. IRT models based on global logits are also 
known as graded response models, those based on local logits are known as partial 
credit models. 



2. Constraints on the discriminating parameters: we consider: (z) a general situation 
in which each item may discriminate differently from the others and (m) a special 
case in which all the items discriminate in the same way, that is 

7, = 1, j = l,...,r. (2) 

Note that, in both cases, we assume that, within each item, all Ij > 2 response 
categories share the same jj, in order to keep the conditional probabilities away 
from crossing and so avoiding degenerate conditional response probabilities. We 
also observe that, with binary items, case (i) corresponds to a two-parameters lo- 
gistic (2PL) formulation (Birnbaum, 1968), whereas case {ii) is referred to a Rasch 
parameterization (Rasch, 1960). 

3. Formulation of item difficulty parameters: limited to the case of polythomously- 
scored items (i.e., Ij > 2), we consider: (i) a general situation in which the param- 
eters are unconstrained and (ii) a special case in which these parameters are 
constrained so that the distance between difficulty levels from category to category 
is the same for each item (rating scale parameterization). Obviously, the second 
case makes sense when all items have the same number of response categories, that 
is Ij — I, j — 1, ... ,r. This constraint may be expressed as 

Pjx^ I3j + T^, J = 1, . . . , r, X = 0, . . . , Z - 1, (3) 

where indicates the difficulty of item j and Tx is the difficulty of response category 
X for all j. Note that, with binary items, we have = Pj. 

By combining the above constraints, we obtain four different specifications of the item 
parametrization (second member of eqn. 1), based on free or constrained discriminating 
parameters and on a rating scale or a free parameterization for difficulties (Table 1). 
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discriminating difficulty item 

indices levels parameterization 

free free ^jd^d - Pjx) 

free constrained Jj[z2d ^jd^^d - {Pj + Tx)] 

constrained free ^jdOd ^ 0jx 

ct)ust rained couslriiiucd '}Zd'^i<i^<< ^ ('^y + '■'') 

Table 1: Specifications of the item parametrization 



According to the above illustrated item parameterization and also according to the 
type of link function, several different types of multidimensional LC IRT models derive 
which extend to multidimensionality and discreteness of latent traits some well-known 
traditional unidimensional IRT models. For instance, we may define the multidimensional 
LC Graded Response Model (GRM), which is an extension of the GRM by Samejima 
(1969), as 

lfx,~<% = % ^ ^'^P " ^ ^ ^' ■ ■ ■ ' - ^' 

and the multidimensional LC Rating Scale Model (RSM), which is an extension of the 
RSM by Andrich (1978), as 

p(xfl X -%^^d) = i - + a; = l,...,/-L (5) 

Note that when Ij = 2, j = 1, . . . , r, so that item responses are binary, equations (4) and 
(5) specialize, respectively, in the multidimensional LC 2PL model and in the multidi- 
mensional LC Rasch model, both of them described by Bartolucci (2007). 

In all cases, the discreteness of the distribution of the random vector © implies that 
the manifest distribution of X = (Xi, . . . , Xr)' for all subjects in the c-th latent class is 
equal to 

k 

p{x) =p{X = x) = Y,P{X = x\& = OtTc, (6) 

c=l 
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where, due to the classical assumption of local independence, we have 

r 

p{x\c)=p{X = x\@ = ^,) = l[p{X, = xj\@ = a ^ 

s 

where J'd denotes the subset oi J' — {1, . . . ,r} containing the indices of the items mea- 
suring the d-th latent trait, with d — 1, . . . , s and ^cd denoting the d-th elements of 

In order to ensure the identifiability of the proposed models, suitable constraints on 
the parameters are required. With reference to the general equation (1), we require that, 
for each latent trait, one discriminating index is equal to 1 and one difficulty parameter 
is equal to 0. More precisely, let ja be a specific element of J'd, say the first. Then, when 
the discriminating indices are not constrained to be constant as in (2), we assume that 

7jd = 1, = 1, . . . , s. 

Moreover, with free item difficulties we assume that 

^,,1 = 0, d^l,...,s, (8) 

whereas with a rating scale parameterization based on (3), we assume 

Pj^^O, d = 1, . . . , s, and n = 0. (9) 

Coherently with the mentioned identifiability constraints, the number of free param- 
eters of a multidimensional LC IRT model is obtained by summing the number of free 
probabilities tTc, the number of ability parameters ^cd, the number of free item difficulty 
parameters pj^, and that of free item discriminating parameters jj. We note that the 
number of free parameters does not depend on the type of logit, but only on the type 
of parametrization assumed on item discriminating and difficulty parameters, as shown 
in Table 2. In any case, the number of free class probabihties is equal to — 1 and the 
number of ability parameters is equal to sk. However, the number of free item difficulty 
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parameters is given by [X]^=i(^i ~ 1) ~ "^l under an unconstrained difficulties parameteri- 
zation, by [(r — s) + (/ — 2)] under a rating scale parameterization, and by r — s in case 
of binary items {Ij — 2, j — 1, . . . ,r). Finally, the number of free item discriminating 
parameters is equal to (r — s) under an unconstrained discriminating parameterization, 
being otherwise. 



discriminating ditticuity 
indices levels 


Number of free parameters 
(#par) 


free free 
free constrained 

constrained free 
constrained constrained 


^k-l) + sk+[j:'j=iilj-l)-s\+ir-s) 
ik-l) + sk + [{r -s) + {l- 2)] + {r-s) 
{k-l) + sk+[ZU(h-l)-s] 
{k-l) + sk + [{r -s) + {l- 2)] 



Table 2: Number of free parameters for different constraints on item discriminating and 
difficulty parameters. 

2.2 Formulation in matrix notation 

In order to efficiently implement parameter estimation, in this section we express the 
above described class of models by using the matrix notation. For simplicity, we consider 
the case in which every item has the same number of response categories, that is Ij — I, 
j — 1, . . . ,r; the extension to the general case in which items may also have a different 
number of response categories is straightforward. In the following, by Oa we denote a 
column vector of a zeros, by Oab an a x 6 matrix of zeros, by /„ an identity matrix of 
size a, by !„ a column vector of a ones. Moreover, we use the symbol Uab to denote a 
column vector of a zeros with the 6-th element equal to one and to denote an a x a 
lower triangular matrix of ones. Finally, by <S) we indicate the Kronecker product. 

As concerns the link function in (1), it may be expressed in a general way to include 
different types of parameterizations (Glonek and McCuUagh, 1995; Colombi and Forcina, 
2001) as follows: 

g[X,{e)]^Clog[MX,{e)], (10) 

11 



where the vector g[Xj{0)] has elements gx[Xj{d)] for x — 1, — 1. Moreover, C is a 
matrix of constraints of the type 

whereas, for the global logit link, matrix M is equal to 
and for the local logit link it is equal to 

How to obtain the probability vector Xj{6) on the basis of a vector of logits defined 
as in (10) is described in Colombi and Forcina (2001), where a method to compute the 
derivative of a suitable vector of canonical parameters for Xj{0) with respect to these 
logits may be found. 

Once the ability and difficulty parameters are included in the single vector </> and 
taking into account that the distribution of © has k support points, assumption (1) may 
be expressed through the general formula 

9[MD] = ljZcj<f>, c = 1, . . . , A;, j = 1, . . . , r, 

where Zcj is a suitable design matrix. The structure of the parameter vector i/j and of 
these design matrices depend on the type of constraint assumed on the difficulty param- 
eters, as we explain below. 

When the difficulty parameters are unconstrained, </> is a column vector of size sk + 
r{l — 1) — s, which is obtained from 

{ill-, ■ ■ ■ 1 iis, • • • , iks, • • • , • • • , (^r,lr-l)' 

by removing the parameters constrained to be 0; see constraint (8). Accordingly, for 
c = 1, . . . , A; and j = 1, . . . , r, the design matrix Zcj is obtained by removing suitable 
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columns from the matrix 

where d is the dimension measured by item j. On the other hand, under a rating scale 
parameterization, is a vector of size sk + {r — s) + {I — 2) which is obtained from 

(Cll) ■ ■ ■ ) ^Is, Cfes, Pi,---, Pr, Tl,---, Tl-lY 

by removing the parameters constrained to be in (9). Accordingly, the design matrix 
Zcj is obtained by removing specific columns from 

{h-iiukc<^Usd)' h-iKj 
where, again, d is the dimension measured by item j. 

3 Likelihood inference 

In this section, we deal with likelihood inference for the models proposed in the previous 
section. In particular, we first show how to compute the model log-likelihood and how to 
maximize it by an EM algorithm. Finally, we deal with some relevant aspects of model 
selection, mainly concerning the assessment of the latent dimensions. 

3.1 Maximum likelihood estimation 

On the basis of an observed sample of dimension n, the log-likelihood of a model formu- 
lated as proposed in Section 2 may be expressed as 

X 

where r] is the vector containing all the free model parameters, Ux is the frequency of the 
response configuration x, p{x) is computed according to (6) and (7) as a function of rj, 
and by ^® mean the sum extended to all the possible response configurations x. 
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In oder to maximize £(17) with respect to rj we use an EM algorithm (Dempster et al., 
1977) that is implemented in a similar way as described in Bartolucci (2007), to which 
we refer for some details. First of all, denoting by rricx the (unobserved) frequency of the 
response configuration x and the latent class c, the complete log-likelihood is equal to 

^*(^) = ^^^c,X^Og\p{x\c)7lc]. (11) 
c X 

Now we denote by ij^ the subvector of rj which contains the free latent class probabilities 
and by T72 the subvector containing the remaining free parameters. More precisely, we let 
rji — TT, with TT = (7r2, . . . , TTfc)', and 7/2 = (7', (p')', where 7 is obtained by removing from 
(71, . . . , jr)' the parameters which are constrained to be equal to 1 to ensure identifiability. 
Obviously, 7 is not present when constraint (2) is adopted. Then, we can decompose the 
complete log-likelihood as 

with 

nim) = J^melogTTe, (12) 
c 

^IM = J^X^mVlogA,,-, (13) 

c j 

where rric = ^c,a; is the number of subjects in latent class c and rricj is the column 
vector with elements I{xj — x)mc,x-i x = 1, . . . , /j — 1, with /(•) denoting the indicator 
function. 

The EM algorithm alternates the following two steps until convergence: 

E-step: compute the conditional expected value of t{ri) given the observed data and 
the current value of the parameters; 

M-step: maximize the above expected value with respect to ry, so that this parameter 
vector results updated. 
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The E-step consists of computing, for every c and cc, the expected value of m^a; given 
follows 

p{x\c)nc 

and then substituting these expected frequencies in (11). On the basis of m^a; we can 
obtain the expected frequencies rhc and rhcj which, once substituted in (12) and (13), 
allow us to obtain the expected values of il{T]i) and ^2(^2)) denoted by il{T]i) and ^^2(^2); 
respectively. 

At the M-step, the function obtained as described above is maximized with respect 
to rj as follows. First of all, regarding the parameters in rji we have an explicit solution 
given by 

rhc 

TTc = , c — 2, . . . , k, 
n 

which corresponds to the maximum of ii{T}i). To update the other parameters, we max- 
imize ^2(^72) by a Fisher-scoring algorithm that we illustrate in the following. 

The Fisher-scoring algorithm alternates a step in which the parameter vector 7 is 
updated with a step in which the parameter vector </> is updated. The first step consists 
of adding to the current value of each free the ratio S2j/ f2j, where denotes the score 
for ^2(^2) with respect to 7^ and denotes the corresponding information computed at 
the current value of the parameters. These have the following expressions: 

c j 

f2j = 5^m,^(Ze,(/>)'i2ydiag(Ae,)-Ae,Ayi?c,(^c,</>), 

c j 

where Rcj is the derivative matrix of the canonical parameter vector for Xcj with respect 
to the vector of logits in (10); see Colombi and Forcina (2001). Then, the parameter 
vector (j) is updated by adding the quantity {F2)~^sl, where S2 is the score vector for 
£2(^2) with respect to (f) and F2 denotes the corresponding information computed at the 
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current parameter value, which have the following expressions: 





c 



3 




c 
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As usual, we suggest to initialize the EM algorithm by a deterministic rule and by 
a multi-start strategy based on random starting values which are suitable generated. In 
this way we can deal with the multimodality of the model likelihood. 

3.2 Model selection 

The formulation of a specific model in the class of multidimensional LC IRT models uni- 
vocally depends on: [i) the number of latent classes (k); {ii) the adopted parameterization 
in terms of link function gx{-), {Hi) the constraints on the item parameters 7j and Pjx, 
and (iv) the number (s) of latent dimensions and the corresponding allocation of items 
within each dimension {Sjd, j = 1, . . . ,r, d = 1, . . . ,s). 

Thus, the model selection implies the adoption of a number of choices, for each of 
the mentioned aspects, by using suitable criteria. We suggest to mostly rely on the 
likelihood ratio (LR) test to compare nested models and on the Bayesian information 
criterion (BIC; Schwarz, 1978) to obtain a relative measure of lost information when a 
given model is used to describe observed data and, therefore, to compare non-nested 
models. More precisely, the LR test is here preferred to the Wald test, because it does 
not require to compute the information matrix of the model. Moreover, the BIC index 
has to be preferred to other information criteria, because it satisfies some nice properties. 
Mainly, under certain regularity conditions it is asymptotically consistent (Keribin, 2000). 
Moreover, since it applies a larger penalty for additional parameters (for reasonable sample 
sizes) in comparison with other criteria, BIC tends to select a more parsimonious model. 

In summary, we suggest to base the selection of the number of latent classes and the 
type of logit link function on the BIC index, whereas the selection of the item discrimi- 
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nating and difficulty parameterization may be performed on tfie basis of tfie LR test. 

Tlie LR test may also be used to test the null hypothesis that items in 1^^ and 
measure the same latent trait. Therefore, being equal all the other elements of the model 
(number of latent classes, constraints on item parameters, type of logit), the general 
model with s dimensions is compared with a restricted version with s — 1 dimensions, 
where items in X^-^ and are collapsed in the same group. The LR test statistic is 
given by —2 • {1^ — ii), where Iq and li denote the maximum of log- likelihood of the 
restricted model and of the general model, respectively. Under Hq, the LR statistics is 
asymptotically distributed as a Xq^ where q is given by the difference in the number of 
parameters (see Table 2) between the two nested models being compared. 

By repeating this LR test for dimensionality in a suitable way, we can cluster items 
so that items in the same group measure the same ability. On the basis of this principle, 
Bartolucci (2007) proposed a hierarchical clustering algorithm based on Wald test and 
not yet implemented in R: its implementation for LR test is supplied in the MultiLCIRT 
package. This algorithm builds a sequence of nested models: the most general one is 
that with a separate dimension for each item (corresponding to the classic LC model) 
and the most restrictive model is that with only one dimension common to all items (uni- 
dimensional model). The clustering procedure performs r — 1 steps. At each step, the 
LR test statistic is computed for every pair of possible aggregations of items (or groups 
of items). The aggregation with the minimum value of the statistic (or equivalently the 
highest p- value) is then adopted and the corresponding model fitted before moving to the 
next step. 

The output of the above clustering algorithm may be displayed through a dendrogram 
that shows the deviance between the initial (/c-dimensional) LC model and the model 
selected at each step of the clustering procedure. Obviously, the results of a cluster analysis 
based on a hierarchical procedure depend on the adopted rule to cut the dendrogram, 
which may be chosen according to several criteria. We suggest to choose s — r — h + 1, 
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where h is the first step for that the p- value corresponding to the LR statistic is smaller 
than a suitable threshold, such as 0.05. 

4 The MultiLCIRT package 

The class of multidimensional LC IRT models described in the previous sections may 
be estimated through a set of functions implemented in the MultiLCIRT package of R 
software. In the following we illustrate the use of main functions; for related details see 
the official documentation (Bartolucci et al., 2012). 

4.1 Aggregating data 

To speed up the estimation of the proposed models and the clustering of items wc suggest 
to perform the analyses by aggregating original records having the same response pattern 
so as to obtain a matrix with a record for each distinct response configuration (rather than 
for each statistical unit). For this aim we may use function aggr_data, which requires 
as input the data matrix of unit-by-unit response configurations. The following output is 
obtained: 

• data_dis is the matrix of distinct configurations; 

• f req is the vector of corresponding frequencies; 

• label is the index of each provided response configuration among the distinct ones. 

4.2 Estimating a Multidimensional LC IRT model 

Parameters estimation for multidimensional IRT models based on discreteness of latent 
traits is performed through function est_multi_poly, which requires the following main 
input: 

• S: matrix of all response sequences observed at least once in the sample and listed 
row-by- row. Usually, S is matrix data_dis obtained by applying function aggr_data 
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to original data. Missing responses are allowed and they are coded as 999; 

• yv: vector of the frequencies of every response configuration in S, corresponding to 
output f req by function aggr_data; 

• k: number of latent classes; 

• start: method of initialization of the algorithm: for deterministic starting values, 
1 for random starting values, and 2 for arguments given by the user. In case of 
start = 2, we also need to specify as input the initial values of weights, support 
points, discriminating and difficulty item parameters; 

• link: type of link function: for the standard LC model (i.e., no link function is 
specified), 1 for global logits, and 2 for local logits. In case of binary items, it is the 
same to specify link=l or link=2; 

• disc: indicator of constraints on the discriminating item parameters: if 7j = 
1, j = 1, . . . , r and 1 if 7j 7^ 1 for at least one j = 1, . . . , r; 

• dif 1: indicator of constraints on the difficulty item parameters: if difficulties are 
free and 1 if Pjx — (3j + Tx] 

• multi: matrix with a number of rows equal to the number of dimensions and ele- 
ments in each row equal to the indices of the items measuring the dimension corre- 
sponding to that row. 

Function est_multi_poly supplies the following output: 

• piv: estimated vector of weights of the latent classes; 

• Th: estimated matrix of ability levels (support points) for each dimension and latent 
class; 

• Bee: estimated vector of difficulty item parameters (split in two vectors if dif 1=1); 
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• gac: estimated vector of discriminating item parameters; 

• f v: vector indicating the reference item chosen for each latent dimension; 

• Phi: array of the conditional response probabilities for every item and latent class; 

• Pp: matrix of the posterior probabilities for each response configuration and latent 

class; 

• Ik: log-likelhood at convergence of the EM algorithm; 

• np: number of free parameters; 

• aic: Akaike Information Criterion index (Akaike, 1973); 

• bic: Bayesian Information Criterion index (Schwarz, 1978). 

4.3 Analysis of dimensionality 

As outlined in Section 3.2, the analysis of dimensionality is based on two main procedures: 
{i) using a LR test to compare pairs of nested models that differ only by the way items 
are grouped and (ii) performing a model-based hierarchical clustering, when the grouping 
structure of items is far from clear. Procedure (i) may be carried on by function test_dim 
and procedure (ii) may be carried on by function class_item, being both of them based 
on the above described est_multi_poly function. Both of these functions rely on the 
same input than est_inulti_poly, with just some differences. 

Function class_item does not require any specification of multidimensionality struc- 
ture, being it the object of analysis, whereas function test_dim requires to specify the 
multidimensional structure of both nested models, through the following arguments: 

• multiO: matrix specifying the multidimensional structure of the restricted model, 
being the unidimensionality the default; 

• multil: matrix specifying the multidimensional structure of the larger model. 
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Output of test.dim consists in the following elements: 

• outO: output for the restricted model obtained from est_multi_poly; 

• outl: output for the larger model obtained from est_multi_poly; 

• dev: LR test statistic; 

• df : number of degrees of freedom for the LR test; 

• pv: p-value for the LR test. 

Finally, output of class.item is given by: 

• merge: list of items and/or groups of items collapsed at each step of the clustering 
procedure; 

• order: list of items sorted in a suitable way to represent the hierarchy of clusters 

through a dendrogram; 

• height: values of LR statistic obtained at each step of the clustering procedure; 

• Ik : maximum log-likelihood of the model resulting from each aggregation; 

• np: number of free parameters of the model resulting from each aggregation; 

• groups: list of groups resulting (step-by-step) from the hierarchical clustering. 
Elements merge, order, and height can be used as input to plot a dendrogram. 

5 Examples 

In the following, we describe the use of MultiLCIRT package through two applications on 
different datasets. Firstly, we illustrate the hierarchical clustering procedure of a set of 
binary items. Then, the model selection problem is taken into account by considering 
a set of ordinal polythomous items and we illustrate how choosing the number of latent 
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classes, the type of logit, and the constraints on item parameters. We also deal with 
dimensionality testing. 

5.1 Hierarchical clustering for NAEP data items 

We illustrate the hierarchical clustering procedure by using a dataset containing the re- 
sponses of a sample of 1510 examinees to 12 binary items on mathematics. It has been 
extrapolated (Bartolucci and Forcina, 2005) from a larger dataset collected in 1996 by 
the Educational Testing Service within the National Assessment of Educational Progress 
(NAEP) project. Here we rephcate the example of Bartolucci (2007) performed on the 
same data, but based on the Wald test rather than the LR test and implemented in 

Matlab. 

We start loading the data: 

> data(naep) 

> X = as .matrix (naep) 

Then, we aggregate our data through function aggr_data, so reducing the number of 
records from 1510 to 740: 

> outl = aggr_data(X) 

> S = outl$data_dis 

> yv = outl$freq 

> nrow(X) 
1510 

> iirow(S) 
704 

The hierarchical clustering of 12 items is performed through function class_item and, 
according to Bartolucci (2007), it is based on the 2PL model with k — A latent classes: 

> out2 = class_item(S, yv, k=4, link=l, disc=l) # it is the same with link = 2 

> class (out2) = "hclust" 

Results of clustering are displayed in the dendrogram obtained by command 

> plot(out2) 

and shown in Figure 1. 

Detailed results are obtained by calling outputs in object out2 as 
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Cluster Dendrogram 



Figure 1: Dendrogram for the NAEP data. 
> cbind(out2$merge, out2$height) 



[,1] 


[,2] 






[,3] 


[1.] 


-3 


-8 





2433178 


[2,] 


-5 - 


11 





6306477 


[3,] 


-4 - 


12 


1 


0835975 


[4,] 


-6 


-7 


1 


9434207 


[5,] 


-2 - 


10 


3 


0027518 


[6,] 


-9 


2 


4 


9478475 


[7,] 


-1 


3 


7 


4714241 


[8,] 


4 


7 


12 


8246846 


[9,] 


5 


6 


19 


2495712 


[10,] 


8 


9 


38 


6127909 


[11,] 


1 


10 


45 


4079171 



In the above output, the first two columns detect items (negative values) or groups 
(positive values) collapsed at the corresponding step of the clustering procedure, whereas 
the third column identifies the corresponding LR statistic. More precisely, each positive 
value h > in column 1 and 2 denotes a specific group created at the h-th step of the 
procedure. For instance, the first aggregation is about items 3 and 8. These two items 
define group 1, as it will be called in the following. The second aggregation concerns items 
5 and 11 defining group 2, and so on. Then, at sixth step item 9 is aggregated with items 
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in group 2 (i.c, items 5 and 11); at seventh step item 1 is aggregated with items in group 
3 (i.e., items 4 and 12), and so on. To visuahze all groups we digit 
> out2$group 

To simplify the interpretation we may rearrange the results from the above call in 
Table 3, where third column identifies the clusters and fourth column the corresponding 
LR statistic; for a sake of completeness we also add the p- values (last column), calculated 
on the basis of a xl distribution. 



Step h 


s 


Clusters 


LR statistics 


value 


1 


11 


|1}, |2}, |4}, |5|, |6|, |7|, |9|, |10}, 111}, |12}, |3, 8| 


0.243 


0.885 


2 


10 


{1}, {2}, {4}, {6}, {7}, {9}, {10}, {12}, {5, 11}, {3, 8} 


0.631 


0.960 


3 


9 


{1}, {2}, {6}, {7}, {9}, {10}, {4, 12}, {5, 11}, {3, 8} 


1.084 


0.982 


4 


8 


{1}, {2}, {9}, {10}, {6, 7}, {4, 12}, {5, 11}, {3, 8} 


1.943 


0.983 


5 


7 


{1}, {9}, {2, 10}, {6, 7}, {4, 12}, {5, 11}, {3, 8} 


3.003 


0.981 


6 


6 


{1}, {2, 10}, {6, 7}, {4, 12}, {5, 9, 11}, {3, 8} 


4.948 


0.960 


7 


5 


{2, 10}, {6, 7}, {1, 4, 12}, {5, 9, 11}, {3, 8} 


7.471 


0.915 


8 


4 


{2, 10}, {1, 4, 6, 7, 12}, {5, 9, 11}, {3, 8} 


12.825 


0.686 


9 


3 


{2, 5, 9, 10, 11}, {1, 4, 6, 7, 12}, {3, 8} 


19.250 


0.377 


10 


2 


{1, 2, 4, 5, 6, 7, 9, 10, 11, 12}, {3, 8} 


38.613 


0.007 


11 


1 


{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12} 


45.408 


0.002 



Table 3: Output of the hierarchical clustering algorithm for the NAEP data. 



We note that intermediate and final results of the clustering procedure, shown in Table 
3, are similar to those illustrated in Bartolucci (2007). More precisely, we can detect s = 3 
groups of items ({2, 5, 9, 10, 11), {1, 4, 6, 7, 12}, {3, 8}), corresponding to different latent 
traits. 

5.2 Model selection for HADS data 

We describe the model selection procedure and the estimation of an ordinal polythomous 
multidimensional LC IRT model by using data concerning a sample of 201 oncological 
Italian patients who were asked to fill in questionnaires about their health and perceived 
quality of life. Here we are interested in anxiety and depression, as assessed by the 
"Hospital Anxiety and Depression Scale" (HADS) developed by Zigmond and Snaith 
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(1983). The questionnaire is composed by 14 polythomous items equally divided between 
the two dimensions: 

1. anxiety (7 items: 2, 6, 7, 8, 10, 11, 12); 

2. depression (7 items: 1, 3, 4, 5, 9, 13, 14). 

All items of the HADS questionnaire have four response categories: the minimum 

value corresponds to a low level of anxiety or depression, whereas the maximum value 

3 corresponds to a high level of anxiety or depression. 

Similarly to the previously described example, we begin with loading data and aggre- 
gating them: 

> data(hads) 

> X = as .matrix (hads) 

> outl = aggr_data(X) 

> S = outl$data_dis 

> yv = outl$freq 

Then, we define a suitable matrix to accommodate assumption of bidimensionality (we 
remind that unidimensionality structure of items is treated as default in the functions of 
MultiLCIRT package): 

> dim2 = rbind(c(2,6,7,8,10,ll,12) ,c(l,3,4,5,9,13,14)) 

To proceed to the selection of the optimal model (see Section 3.2), we detect the opti- 
mal number k of latent classes. To this aim, we suggest to adopt the standard LC model 
(Goodman, 1974). In this way, no choice on the link function and the item parameteri- 
zation is requested; also, any restrictive assumptions on item dimensionality is avoided. 
Therefore, function est_multi_poly is employed and a comparison among models which 
differ by the number of latent classes is performed for increasing values of fc. As outlined 
in Section 2, we rely on BIG index, taking as optimal number of latent classes that value 
just before the first increasing of BIG. 

> outl = est_niulti_poly(S,yv,k=l,start=0, link=0) 

> out2 = est_iiiulti_poly(S,yv,k=2,start=0, link=0) 

> out3 = est_niulti_poly(S,yv,k=3,start=0, link=0) 

> out4 = est_multi_poly(S,yv,k=4,start=0, liiik=0) 
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Results are obtained by calling outputs in objects outl to out4 as follows, where 
maximum log-likelihood, number of model parameters, and BIC values for k = 1, ... ,4 
latent classes are shown in columns one, two and three, respectively. 

> rbind(cbind(outl$lk, outl$np, outl$bic) , cbiiid(out2$lk, out2$np, out2$bic) , 
+ cbind(out3$lk, out3$np, out3$bic) , cbind(out4$lk, out4$np, out4$bic)) 

[,1] [,2] [,3] 

[1,] -3153.151 42 6529.040 

[2,] -2814.635 85 6080.051 

[3,] -2677.822 128 6034.468 

[4,] -2645.435 171 6197.736 

On the basis of the adopted selection criterium, we choose k — 3 as optimal number of 
latent classes as, in correspondence of this number of latent classes, the smallest estimated 
BIC value is observed. To avoid that the choice of k falls in correspondence of a local 
- rather than a global - maximum point, we suggest to repeat the selection process by 
randomly varying the starting values of the model parameters: for each possible value of 
k the highest obtained log-likelihood value is identified and, consequently, the smallest 
estimated BIC value. For this aim, just put start=l rather than start=0 in function 
est_multi_poly. 

As regards to the following step concerning the choice of the best logit link function, 
a comparison between a model with global logit link and a model with local logit link is 
carried out on the basis of the BIC index and by assuming A; = 3 latent classes, free item 
discriminating and difficulties parameters, and a completely general multidimensional 
structure for the data (i.e., r dimensions, one for each item). 

> out31 = est_multi_poly(S,yv,k=3,start=0, link=l,disc=l,dif 1=0, 
+ multi=cbind(l:iicol(S))) # Global logit 

> out32 = est_multi_poly(S,yv,k=3,start=0, link=2,disc=l,dif 1=0, 
+ inulti=cbind(l:ncol(S))) # Local logit 

> cbind(rbind(out31$lk, out31$np, out31$bic) , rbind(out32$lk, out32$np, 
+ out32$bic)) 

[,1] [,2] 
[1,] -2726.348 -2741.321 
[2,] 72.000 72.000 
[3,] 5834.534 5864.479 
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Results show that a global logit hnk (first column) has to be preferred to a local logit 
link (second column). Also, it can be observed that a graded response type model has a 
better fit than the standard LC model, as the BIC value observed for the former is smaller 
than that detected for the latter (6197.736). 

Once we have chosen the global logit as the best link function, we carry on with 
the test of unidimensionality. An LR test is used to compare models which differ on 
account of their dimensional structure, all other elements being equal (i.e., free item 
discriminating and difficulty parameters), that is (i) a graded response model with r- 
dimensional structure, (n) a graded response model with bidimensional structure (i.e., 
anxiety and depression), as suggested by the structure of the questionnaire, and {in) a 
graded response model with unidimensional structure (i.e., all the items belonging to the 
same dimension). The LR tests are performed through function test_dim, firstly applied 
to compare models at points {i) and {ii), and then applied to compare models at points 
(ii) and (m). 

> test = test_diin(S, yv, k=3, liiik=l, disc=l, difl=0, multi0=dim2, 
+ multil=cbind(l:ncol(S))) 

Log-likelihood of the constrained model = -2731.7092 
Log-likelihood of the unconstrained model = -2726.3450 
Deviance = 10.7226 
Degrees of freedom = 12 
P-value = 0.5528 

> test = test_dim(S, yv, k=3, link=l, disc=l, difl=0, multil=dim2) 

Log-likelihood of the constrained model = -2731.8937 
Log-likelihood of the unconstrained model = -2731.7092 
Deviance = 0.3689 
Degrees of freedom = 1 
P-value = 0.5436 

The hypothesis of unidimensionality cannot be rejected. This result is coherent with a 
similar analysis performed on the same data by (Bacci and Bartolucci, pear), where item 
responses were dichotomized and a Rasch parameterization was adopted. We note that 
the test of unidimensionality may be substituted by a hierarchical clustering of items, in 
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case we should not have any idea of the grouping of items and of the latent traits measured 
by the questionnaire. 

As previously outlined, the choice of the number of parameters per item depends 
on both the presence of a constant/non-constant discriminating index (7^), and of a 
constant/non-constant threshold difficulty parameter {Pjx), for each item. In our appli- 
cation, this implicates a comparison among four graded response-type models: (i) GRM 
properly called, (ii) GRM with a rating scale parameterization (i.e., free discriminating 
item parameters and constrained difficulty item parameters) or RS-GRM (Muraki, 1990), 
(iii) GRM with constrained discriminating parameters and free difficulty parameters, also 
known as IP-GRM (Van der Ark, 2001), (iv) GRM with both constrained discriminating 
and difficulty parameters or IP- RS-GRM (Van der Ark, 2001). The parameterization is 
chosen on account of the unidimensional data structure and the previously selected global 
logit link function. Besides, because the compared models are nested, the parameteriza- 
tion is selected on the basis of an LR test. For the sake of completeness, log-likelihood 
and BIG values are also provided for each model considered. 

# Unidimensional GRM 

> outSll = est_inulti_poly(S, yv, k=3, steLrt=0, link=l, disc=l, difl=0) 

# Unidimensional RS-GRM 

> outSlll = est_multi_poly(S, yv, k=3, start=0, link=l, disc=l, difl=l) 

# Unidimensional IP-GRM 

> out3112 = est_multi_poly(S, yv, k=3, start=0, link=l, disc=0, difl=0) 

# Unidimensional IP-RS-GRM 

> out3113 = est_multi_poly(S, yv, k=3,start=0, link=l, disc=0, difl=l) 

# comparison between RS-GRM vs GRM 

> Devi = -2*(out3111$lk - out311$lk) 

> dfl = out311$np - out3111$np 

> Pvl = l-pchisq(q = Devi, dfl) 

# comparison between IP-GRM vs GRM 

> Dev2 = -2*(out3112$lk - out311$lk) 

> df2 = out311$np - out3112$np 

> Pv2 = l-pchisq(q = Dev2, df2) 

# comparison between IP-RS-GRM vs IP-GRM 

> Dev3 = -2*(out3113$lk - out3112$lk) 

> df3 = out3112$np - out3113$np 
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> Pv3 = l-pchisq(q = Dev3, df3) 

> rbind(cbind(disc=l, difl=0, lk=out311$lk, np=out311$np, BIC=out311$bic, 
+ LR="", p="", " "), 

+ cbind(disc= 1, difl=l, out3111$lk, out3111$np, out3111$bic, Devi, Pvl, 
+ "(RS-GRM vs GRM)"), 

+ cbind(disc=0, difl=0, out3112$lk, out3112$np, out3112$bic, Dev2, Pv2, 
+ "(IP-GRM vs GRM)") , 

+ cbind(disc=0, difl=l, out3113$lk, out3113$np, out3113$bic, Dev3, Pv3, 
+ "(IP-RS-GRM vs IP-GRM)")) 

disc difl Ik np BIG LR p 

[1,] 1 -2731.894 59 5776.682 NaN NaN " " 

[2,] 1 1 -2795.570 33 5766.149 127.353 0.000 "(RS-GRM vs GRM)" 

[3,] -2741.285 46 5726.521 18.782 0.130 "(IP-GRM vs GRM)" 

[4,] 1 -2844.518 20 5795.102 206.467 0.000 "(IP-RS-GRM vs IP-GRM) 



The analyses show that between GRM and RS-GRM, GRM has to be preferred to RS- 
GRM, while between models GRM and IP-GRM, the latter has to be preferred. Besides, 
as model IP-GRM has a better fit than model IP-RS-GRM, then IP-GRM has to be 
preferred model among the four considered, that is the graded response type models with 
free parameters and constant 7^- parameters. Such a result is achieved by taking into 
account both the BIG criterium and the LR test. 

As the sequence of the previously described steps may be considered partly arguable, it 
can be also shown that the same results - in terms of link function, item parameterization 
and dimensionality choice - would have been obtained if each of such models were com- 
pared at once accounting for log-hkehhood and BIG values as selection criteria. Indeed, 
Table 4 shows that the smallest BIG value is observed when selecting: (z) a global logit 
link function; [ii) constrained 7^ parameters and free j3jx parameters, that is, a IP-GRM 
model; and {in) assuming a unidimensional structure for the data. 

The estimates of support points and probabilities ttc, c = 1,2, 3, under the selected 
unidimensional IP-GRM model are extrapolated from object out3112. 

> rbind(out3112$Th, out3112$piv) 

[,1] [,2] [,3] 

[1,] -0.7757623 1.1830799 3.4185177 
[2,] 0.3419038 0.4911631 0.1669331 
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D imensionaiity 


item parameters 


Global logit 


Local logit 




Ti Pix 






r-dimensional 


free/constr. 
free/constr. 


free 
constrained 


-2726.347 5834.534 
-2815.568 5875.088 


-2741.321 5864.479 
-2836.766 5917.484 


bidimensionai 


free 
constrained 

free 
constrained 


free 
free 

constrained 
constrained 


-2731,249 5780,696 
-2740,658 5735,875 
-2798,959 5778,230 
-2843,227 5803,127 


-2749,839 5817,877 
-2764,787 5784,132 

-2835,611 5851,534 
-2869,223 5855,120 


unidimensionai 


free 
constrained 

free 
constrained 


free 

free 
constrained 
constrained 


-2731,894 5776,682 
-2741,285 5726,521 
-2795,570 5766,149 
-2844,518 5795,102 


-2750,214 5813,323 
-2765,129 5774,211 
-2833,179 5841,366 
-2870,178 5846,422 



Table 4: Log-likelihood and BIC values for the global and local logit link functions, taking 
into account the dimensional structure (r-dimensional /bidimensionai/ unidimensionai) 
and the item parameters (depending on whether they are free/ constraint); in boldface is 
the smallest BIC value. 



On the basis of these results, we conclude that patients who suffer from psychopato- 
logical disturbs are mostly represented in the first two classes, whereas only the 16.7% of 
the subjects belong to the third class. Furthermore, patients belonging to class 1 present 
the least severe conditions, whereas patients in class 3 present the worst conditions. 

To conclude, we outline that the proposed model selection procedure results to be 
simplified in case of binary items, being the selection of logit link function and the selection 
of item difficulties not requested. 



6 Conclusions 

In this paper, we initially illustrate a class of models that extends traditional IRT models 
allowing for {i) multidimensionality, {ii) discreteness of latent traits, and {iii) item re- 
sponses of different nature, in a same unifying framework. In particular, the assumption 
of multidimensionality allows us to take more than one latent trait into account at the 
same time and to study the correlation between latent traits. Moreover, through the 
introduction of latent classes, homogeneous subpopulations of subjects are detected and 
a notable simplification from the computational point of view results with respect to the 
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case of continuous latent traits, where the marginal likelihood is characterized by a multi- 
dimensional integral difficult to treat. Lastly, the proposed class of models is formulated 
to treat with both binary and ordinal polythomous items and, in the latter case, different 
link functions and item parameterizations are provided. 

In order to make inference on the proposed model, we show how the log-likelihood may 
be efficiently maximized by the EM algorithm. We also propose a model selection proce- 
dure to choose the different features that contribute to define a specific multidimensional 
LC IRT model. In general, comparisons between different parameterizations are based on 
information criteria, in particular we rely on the Bayesian Information Criterion, or on 
likelihood ratio test, being this last tool useful in presence of nested models. 

The estimation of the multidimensional LC IRT models is made possible by the avail- 
ability of R package MultiLCIRT (Bartolucci et al., 2012), the use of which is illustrated 
through the description of the main functions and two applications to real data. The first 
application is about data collected by the Educational Testing Service within the National 
Assessment of Educational Progress (NAEP) project and it implements the model-based 
hierarchical clustering procedure for a set of binary items. The results are coherent with 
those obtained by Bartolucci (2007), showing that items may be grouped in three homo- 
geneous groups. The second application is about data concerning the measurement of 
psychopathological disturbs through the Hospital Anxiety and Depression Scale (HADS). 
By this application we show how performing the model selection in presence of ordinal 
polythomous data. The results show that subjects can be classified in three latent classes 
and the item responses can be explained by a graded response type model with items hav- 
ing the same discriminating power and different distances between consecutive response 
categories. The bidimensionality assumption is rejected in favor of unidimensionality, so 
that all items of the questionnaire measure the same latent psychopathological disturb. 
All R replication scripts for the proposed examples are provided. 
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